A Web Crawler Framework for Revenue Management

Authors

  • DANIEL MARTINS
  • ROBERTO LAM
  • FRANCISCO SERRA
Abstract

Smart Revenue Management (SRM) is a project that aims to develop smart automatic techniques for efficiently optimizing the occupancy and rates of hotel accommodations, a practice commonly referred to as Revenue Management (RM). To maximize revenue, hotel managers need current and reliable information about the competitive set of the hotels they manage, so that they can anticipate and influence consumer behavior. One way to obtain some of this information is to inspect the most popular booking and travel websites, where hotels promote themselves and consumers make reservations and review their experiences. This paper presents a web crawler framework that automatically extracts information from those sites to support the RM process of a particular hotel. The crawler periodically accesses the targeted websites and extracts information about a set of features that characterize the hotels listed there. Additionally, we present the document-oriented database used to store the retrieved information and discuss the usefulness of this framework in the context of the SRM system.
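
The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the HTML snippet, the CSS class names (`hotel-name`, `price`, `rating`), the source label, and the document fields are all hypothetical, and only the Python standard library is used. Periodic scheduling and the actual database write are omitted; the sketch only shows how features from one listing could be turned into a single document suitable for a document-oriented store.

```python
from html.parser import HTMLParser
import json

# Hypothetical mapping from CSS class names to feature names; real booking
# sites use different markup, so these names are assumptions.
FEATURE_CLASSES = {"hotel-name": "name", "price": "price", "rating": "rating"}


class HotelFeatureParser(HTMLParser):
    """Collects the text of elements whose class is listed in FEATURE_CLASSES."""

    def __init__(self):
        super().__init__()
        self.features = {}
        self._current = None  # feature currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        self._current = FEATURE_CLASSES.get(css_class)

    def handle_data(self, data):
        # Store the first non-blank text that follows a matching start tag.
        if self._current and data.strip():
            self.features[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def extract_document(html, source, crawled_at):
    """Return one document (a plain dict) ready for a document-oriented store."""
    parser = HotelFeatureParser()
    parser.feed(html)
    doc = dict(parser.features)
    doc["source"] = source          # which site the listing came from
    doc["crawled_at"] = crawled_at  # when this snapshot was taken
    return doc


# Illustrative listing markup (assumed, not taken from any real site).
SAMPLE_HTML = """
<div class="listing">
  <h2 class="hotel-name">Hotel Example</h2>
  <span class="price">120.00</span>
  <span class="rating">8.7</span>
</div>
"""

doc = extract_document(SAMPLE_HTML, "example-booking-site", "2015-01-01T00:00:00Z")
print(json.dumps(doc, sort_keys=True))
```

Keeping each crawl result as a self-contained document with its own timestamp means repeated visits to the same listing produce separate snapshots, which fits the periodic access pattern the abstract describes.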


Related resources

Big Data Warehouse Framework for Smart Revenue Management

The most cited definition of Revenue Management is probably “to sell the right accommodation to the right customer, at the right time and the right price, with optimal satisfaction for customers and hoteliers”. Smart Revenue Management (SRM) is a project that aims to develop smart automatic techniques for efficiently optimizing the occupancy and rates of hotel accommodations, commonly r...


Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task, and an unfocused approach often produces undesired results. Several new ideas have therefore been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...


GeoWeb Crawler: An Extensible and Scalable Web Crawling Framework for Discovering Geospatial Web Resources

With the advance of World-Wide Web (WWW) technology, people can easily share content on the Web, including geospatial data and web services. As a result, “big geospatial data management” issues have started to attract attention. Among these issues, this research focuses on discovering distributed geospatial resources. As resources are scattered on the WWW, users cannot find resource...


Improving the performance of focused web crawlers

This work addresses issues related to the design and implementation of focused crawlers. Several variants of state-of-the-art crawlers relying on web page content and link information for estimating the relevance of web pages to a given topic are proposed. Particular emphasis is given to crawlers capable of learning not only the content of relevant pages (as classic crawlers do) but also paths ...


Slug: A Semantic Web Crawler

This paper introduces “Slug”, a web crawler (or “Scutter”) designed for harvesting semantic web content. Implemented in Java using the Jena API, Slug provides a configurable, modular framework that allows a great degree of flexibility in configuring the retrieval, processing and storage of harvested content. The framework provides an RDF vocabulary for describing crawler configurations and colle...




Publication date: 2015